Investigating the Complementarity of Spectral and Spectro-temporal Features
Authors
Abstract
Most common speech features, such as Mel-Frequency Cepstral Coefficients (MFCCs) and RelAtive SpecTrAl Perceptual Linear Predictive (RASTA-PLP) features, use only spectral information. However, measurements in the mammalian auditory cortex show that the mammalian brain jointly uses spectral and temporal information. To model this, we previously developed Hierarchical Spectro-Temporal (HIST) features [1, 2, 3]. They consist of two layers: the first captures local spectro-temporal variations, and the second integrates them into larger receptive fields. This layout was inspired by a recently proposed system for visual object recognition [4].
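The two-layer idea can be illustrated with a minimal sketch. It is not the HIST implementation: the abstract does not specify the filters or the pooling rule, so the Gabor-shaped patches, the max pooling, and all names and parameter values below (`gabor_kernel`, `local_responses`, `max_pool`, `two_layer_features`, `omega_t`, `omega_f`, `pool`) are assumptions chosen only to show local spectro-temporal filtering followed by integration over larger regions.

```python
import math


def gabor_kernel(size, omega_t, omega_f):
    """Real 2D Gabor patch (Gaussian envelope times cosine carrier).

    omega_t and omega_f are temporal and spectral modulation frequencies
    in radians per bin; the values used here are illustrative assumptions.
    """
    half = size // 2
    sigma = size / 4.0
    rows = []
    for f in range(-half, half + 1):
        row = []
        for t in range(-half, half + 1):
            envelope = math.exp(-(t * t + f * f) / (2.0 * sigma * sigma))
            row.append(envelope * math.cos(omega_t * t + omega_f * f))
        rows.append(row)
    # Subtract the mean so a constant energy offset gives zero response.
    mean = sum(map(sum, rows)) / (len(rows) * len(rows[0]))
    return [[v - mean for v in row] for row in rows]


def local_responses(spec, kernel):
    """Layer 1 (assumed form): rectified 'valid' 2D cross-correlation.

    spec is a list of frequency rows, each a list of time frames.
    """
    k = len(kernel)
    n_f, n_t = len(spec), len(spec[0])
    out = []
    for i in range(n_f - k + 1):
        out_row = []
        for j in range(n_t - k + 1):
            acc = sum(kernel[a][b] * spec[i + a][j + b]
                      for a in range(k) for b in range(k))
            out_row.append(abs(acc))
        out.append(out_row)
    return out


def max_pool(resp, pool):
    """Layer 2 (assumed form): non-overlapping max pooling.

    Trailing rows or columns that do not fill a whole block are dropped.
    """
    n_r = len(resp) // pool
    n_c = len(resp[0]) // pool
    return [[max(resp[r * pool + a][c * pool + b]
                 for a in range(pool) for b in range(pool))
             for c in range(n_c)]
            for r in range(n_r)]


def two_layer_features(spec, kernels, pool=2):
    """Return one pooled response map per kernel."""
    return [max_pool(local_responses(spec, k), pool) for k in kernels]


# Toy "spectrogram": 12 frequency bins by 20 frames of synthetic values.
spec = [[math.sin(0.5 * t) + 0.1 * f for t in range(20)] for f in range(12)]
kernels = [gabor_kernel(5, 0.5, 0.0), gabor_kernel(5, 0.0, 0.5)]
feats = two_layer_features(spec, kernels, pool=2)
```

Each entry of `feats` is a coarser map than the first-layer output, which is the sense in which the second stage covers a larger receptive field. A real front end would start from a log-mel or similar spectrogram and would typically use an array library rather than nested lists.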
Similar resources
Phoneme Classification Using Temporal Tracking of Speech Clusters in Spectro-temporal Domain
This article presents a new feature extraction technique based on the temporal tracking of clusters in the spectro-temporal feature space. In the proposed method, auditory cortical outputs were clustered, and the attributes of the speech clusters were extracted as secondary features. However, the shape and position of speech clusters change over time. The clusters were temporally tracked and temporal tra...
Improved phoneme recognition by integrating evidence from spectro-temporal and cepstral features
Gabor features have been proposed for extracting spectro-temporal modulation information, yielding significant improvements in recognition performance. In this paper, we propose the integration of Gabor posteriors with MFCC posteriors, yielding a relative improvement of 14.3% over an MFCC Tandem system. We analyze, for different types of acoustic units, the complementarity between Gabor featu...
Improved Tonal Language Speech Recognition by Integrating Spectro-Temporal Evidence and Pitch Information with Properly Chosen Tonal Acoustic Units
We propose an improved Tandem system for tonal language speech recognition. Three different types of features, cepstral, spectro-temporal and pitch features, are integrated for modeling tone and phoneme variation simultaneously. Tonal phonemes (or tonemes) are used for MLP posterior estimation, and tonal acoustic units for HMM recognition. In our experiments conducted on Mandarin broadcast news...
Modeling spectro-temporal modulation perception in normal-hearing listeners
The ability of human listeners to detect and discriminate spectro-temporal ripples in sound has been shown to be correlated with speech intelligibility performance in several conditions. Thus, if a model were able to account for the spectro-temporal processing limits in the auditory system, such a framework could be used to analyze the auditory processes contributing to and limiting speech ...
Spectro-temporal modulation subspace-spanning filter bank features for robust automatic speech recognition.
In an attempt to increase the robustness of automatic speech recognition (ASR) systems, a feature extraction scheme is proposed that takes spectro-temporal modulation frequencies (MF) into account. This physiologically inspired approach uses a two-dimensional filter bank based on Gabor filters, which limits the redundant information between feature components, and also results in physically int...